The five papers contained in this volume were presented at the 1997 American Statistical 
Association (ASA) meeting in Anaheim, California (August 10-14). This is the fifth collection of ASA 
papers of particular interest to users of NCES survey data published in the Working Papers Series. 
The earlier collections were Working Paper 94-01, which included papers presented at ASA meetings 
in August 1992 and August 1993 and the ASA Conference on Establishment Surveys in June 1993, 
Working Paper 95-01, which included papers from the 1994 ASA meeting, Working Paper 96-02, 
which included papers from the 1995 ASA meeting, and Working Paper 97-01, which included papers 
from the 1996 ASA meeting. 

A list of SASS methodological papers and reports is included in the following pages for 
readers who wish to learn more about the Schools and Staffing Survey. 

1. Introduction 

During the past few years the authors have been 
trying to find a methodology which produces a set 
of survey weights for which the estimates of 
number of schools, students and teachers agree for 
both a sample survey and its frame. The frame 
estimates are produced for the same time period as 
the survey collection. Of course, the “control” is 
only achieved for a specified set of cells. 

If there were only one estimate, say number of 
schools, then a simple raking procedure would do 
the job. However, we require the agreement of three 
independent estimates simultaneously. Raking does 
not converge for this problem. 

Scheuren and Kaufman (1996) found a solution 
using a generalized least squares methodology 
(GLS). When a multivariate ratio adjustment is 
applied first, weights less than 1 are for the most 
part kept to a minimum, and the implementation 
becomes manageable. The problem with our “solution” is 
that for cells not controlled for, the original weights 
quite often produce closer agreement than the GLS 
weights. In the 1996 paper, it was proposed that a 
mass imputation procedure (Kovar and Whitridge 
1995) might provide better results. 

Mass imputation is where the survey respondents 
are used as donors to impute back to the entire 
frame. If the survey respondent data were mass 
imputed to the frame for all data elements except 
the school, student and teacher elements, then the 
desired consistency would be achieved for all 
estimation cells. All weights would equal 1 and the 
survey estimates for schools, students and teachers 
would equal the frame total. 

The principal problem with this version of mass 
imputation is the difficulty in variance estimation 
for survey variables other than school, student and 
teachers. Since values are assigned to the entire 
frame, standard variance procedures produce a zero 
variance. A variance procedure that measures the 
imputation variance is required to compute the 
mass imputation variance estimates. 

Shao and Sitter (1996) proposed a methodology 
for measuring the imputation variance. It works for 
general estimates coming from any sample design 
and imputation methodology. The methodology 
calls for generating bootstrap samples of both 
respondents and nonrespondents. The original 
imputation procedure is applied to each bootstrap 
sample; and the distribution of bootstrap estimates 
is relied on for inference. 

This paper investigates the magnitude of the 
precision gains that we thought mass imputation 
promised. Additionally, a variance estimator for the 
mass imputation, motivated by the Shao and Sitter 
methodology, is developed. (One potential 
problem, for example, with the Shao and Sitter 
methodology is the assumption that probabilities of 
being a nonrespondent are equal within an 
imputation cell. With mass imputation, this need 
not necessarily be true.) 

The precision gains of mass imputation and its 
proposed variance estimator are tested through a 
simulation study. Frame variables are used, so that 
true values for all sample imputed estimates are 
known. The “respondents” or donors are 
determined using a single stage probability 
proportionate to size sample design, similar to the 
Schools and Staffing Survey. Given these 
respondents: 

(1) A mass imputation is performed and compared 
with the standard Horvitz-Thompson estimator. 

(2) Further, an estimate of the true mean square 
error (MSE) of the mass imputation estimate is 
compared to the Horvitz-Thompson variance. 

(3) Additionally, the proposed mass imputation 
variance is computed and compared to an estimate 
of its true variance. Since all sample estimates can 
be obtained exactly, estimates of the true variance 
are computed using the simple variance of the 
selected simulation sample estimates. 

(4) Finally, the case when nonrespondents are 
selected completely at random will be investigated 
by simulating a sample design following this 
assumption. 

2. NCES Applications for Mass Imputation 

The motivating application for mass imputation 
is the GLS problem described in the introduction. 
However, two other applications are possible. For 
example, NCES uses indirect estimation procedures 
to produce state private school enrollment figures. 
The estimation procedure applies an adjustment 
factor to each known private school. The 
adjustment reflects results from a survey that 
measures the number of schools missing from our 
lists. Since this is a mass imputation procedure, the 
results of this paper can be useful in the variance 
estimation of these state estimates. 

Another potential application of mass imputation 
is in the Center’s data warehousing project. Here, 
the object is to link past and current Center surveys 
across program areas. If sample surveys are mass 
imputed to their frame, then the linkage problem is 
reduced to linking the frames, eliminating the need 
to link all the sample surveys. Again, the variance 
estimator proposed here might be useful. 

3. Imputations 

3.1 Nearest Neighbor Imputation . — The nearest 
neighbor imputations used in this paper are done 
within imputation cells, after the schools have been 
sorted by the number of students per school. The 
imputation cells are state/school level/urbanicity. 
There are three school levels - elementary, 
secondary and combined; also three levels of 
urbanicity - central city, urban fringe/large town 
and rural/small town. After the file is sorted, it is 
accessed sequentially using the nearest responding 
school as the donor for a nonresponding school. For 
a particular unit $i$, we represent the imputed value 
for variable $y$ by $y_{Ii}$. 

The imputation described above is used two ways 
with different file sortings before the imputations 
are determined. The first imputation sorts the file in 
ascending order within imputation cell. This will be 
referred to as ascending imputation. The second 
imputation is done by determining the imputations 
in ascending, as well as descending order. Each 
time an imputation is required, a random 50/50 
selection is used to determine which imputation is 
used in the estimate. 
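
The donor search in section 3.1 can be sketched as follows for a single imputation cell. This is a minimal illustration under stated assumptions, not the production procedure: we assume that "ascending" means walking the size-sorted file from smallest to largest and carrying the most recently seen respondent forward as donor, that "descending" is the same walk in the opposite direction, and that a nonrespondent met before any respondent borrows from the first respondent in walk order. The function names `sequential_donors` and `mixed_donors` are hypothetical.

```python
import random


def sequential_donors(sizes, responded, descending=False):
    """Donor index for every school in one imputation cell.

    Assumption: schools are visited in order of enrollment and the most
    recently visited respondent donates; respondents donate to themselves.
    """
    order = sorted(range(len(sizes)), key=lambda i: sizes[i], reverse=descending)
    respondents = [i for i in order if responded[i]]
    if not respondents:
        raise ValueError("an imputation cell needs at least one respondent")
    donors = [None] * len(sizes)
    last = respondents[0]  # fallback for nonrespondents met before any respondent
    for i in order:
        if responded[i]:
            last = i
        donors[i] = last
    return donors


def mixed_donors(sizes, responded, rng):
    """Ascending/descending variant: a 50/50 draw between the two donors."""
    up = sequential_donors(sizes, responded)
    down = sequential_donors(sizes, responded, descending=True)
    return [u if (u == d or rng.random() < 0.5) else d for u, d in zip(up, down)]


sizes = [120, 300, 450, 800, 950]             # students per school
responded = [True, False, False, True, False]
print(sequential_donors(sizes, responded))                   # [0, 0, 0, 3, 3]
print(sequential_donors(sizes, responded, descending=True))  # [0, 3, 3, 3, 3]
print(mixed_donors(sizes, responded, random.Random(7)))
```

In the mixed variant only schools 1 and 2 have two candidate donors, so only they are affected by the random draw.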

3.2 Mass Imputation . — In mass imputation the 
sample weights associated with a probability 
sample s are ignored. Instead, it is assumed that the 
entire frame is in the sample, but the only units that 
respond are the units in s . Estimates are produced 
by assigning all units on the frame a weight of 1 
and using the units in s as the donors to impute all 
the other frame units. The nearest neighbor 
imputation, described above, was used in the mass 
imputation process. After the imputation, estimates 
were computed as though the entire frame 
responded. If the imputation process is “good,” then 
there may be some efficiency gains, compared to 
the usual Horvitz-Thompson estimator. 

4. Sample Selection (Mass Imputation Donors) 
Two sample designs are used. The main 
design employs the square root of the 
number of teachers in a school as a PPS measure of 
size; the second (subsidiary) design selects units 
with equal probability within an imputation cell. 
The first is used to test the mass imputation 
procedures under the SASS sample design, while 
the second is used for comparison to verify the 
importance of the missing completely at random 
assumption. 

4.1 The Schools and Staffing Sample Design . - 
The Schools and Staffing Survey (SASS) is a 
stratified probability proportionate to size (PPS) 
sample of elementary, secondary and combined 
schools. The selection is done systematically using 
the square root of the number of teachers per school 
as the measure of size. State-by-school level cells 
define the stratification. Before systematic 
selection, schools are sorted to provide a good 
geographic distribution. Sample allocations are 
designed to provide reliable state estimates. In this 
simulation study, four small States were studied. 
The sample state sizes ranged from 72 to 196 
schools. The sampling rates ranged from 14 to 42 
percent of each state’s school population. 

In order to eliminate the SASS design effects 
from systematic sampling and high sampling rates, 
the simulation split each state/school level stratum 
into a number of substrata ( h ) so that exactly two 
schools are selected within each substratum with 
replacement. The original SASS sample sizes by 
state were, however, maintained. 

4.2 The Equal Probability Sample Design . — The 
Shao-Sitter variance methodology assumes that 
nonrespondents are missing completely at random. 
To test the importance of this assumption in the 
SASS setting, the sample selection procedure 
described above was modified to select each school 
in a stratum with equal probability. Within each 
state, the sample sizes again remained the same, but 
the allocation and stratification boundaries were 
altered to achieve the desired equal selection 
probabilities. Again, two units are selected 
within each stratum with replacement. 
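
The two selection schemes in section 4 differ only in the measure of size. The sketch below draws two schools with replacement from one substratum, either proportionally to the square root of the teacher count or with equal probability. It is an illustration only; the function name `select_two` and its arguments are hypothetical, and the formation of substrata is not shown.

```python
import math
import random


def select_two(units, teacher_counts, rng, equal_probability=False):
    """Draw two units with replacement from one substratum.

    PPS design: selection probability proportional to sqrt(teachers).
    Equal probability design: every unit has the same chance per draw.
    """
    if equal_probability:
        weights = [1.0] * len(units)
    else:
        weights = [math.sqrt(t) for t in teacher_counts]
    return rng.choices(units, weights=weights, k=2)


rng = random.Random(1997)
schools = ["A", "B", "C"]
teachers = [4, 25, 100]   # measures of size are 2, 5 and 10
print(select_two(schools, teachers, rng))
print(select_two(schools, teachers, rng, equal_probability=True))
```

Because the draws are made with replacement, the same school can appear twice in a substratum sample.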

5. Mass Imputation Bootstrap Variances 
To generate the bootstrap variance estimator for an 
estimate $\hat{\theta}$, the following is done: 

(1) A bootstrap sample $s^*$ is generated by selecting 
2 units from $s$ within each of the $h$ strata. The 
selection is done with equal probability and with 
replacement. 

(2) Then $s^*$ is sorted by imputation cell, and one 
bootstrap unit is randomly selected within each 
imputation cell and eliminated from $s^*$. This is 
done in an attempt to produce a less biased 
variance estimate. The Shao-Sitter procedure does 
this by selecting n-1 units within each stratum. In 
the Shao and Sitter setting, this is appropriate since 
variability is introduced through the sampling 
mechanism. In the mass imputation setting, there is 
only an imputation variance. Therefore, the 
appropriate place to reduce the sample size seemed 
to be where the imputation process begins - the 
imputation cells. To verify that the imputation cell 
is the appropriate place to reduce the bootstrap 
sample size by one, a simulation was done, 
reducing the sample size in the stratum controlling 
the donor selection. (See Table 4.) 

(3) The bootstrap mass imputation is generated by 
doing the mass imputation procedure on the 
original frame using the units determined in step 2 
as donors. 

(4) Using the results from step 3, compute the 
bootstrap estimate $\hat{\theta}^*$ the same way $\hat{\theta}$ is 
calculated. 

(5) Repeat steps (1) to (4) $B$ times, producing 
bootstrap estimates $\hat{\theta}^*_j$, $j = 1, \ldots, B$. 

(6) The simple variance of the $\hat{\theta}^*_j$ is our bootstrap 
variance estimate, $V^*(\hat{\theta})$. 
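
The six steps above can be sketched as a short routine. This is an illustration under stated assumptions rather than the code used for the simulations: the mass imputation and estimation of steps (3) and (4) are delegated to a hypothetical callback `estimate_from_donors`, the sample is represented as hypothetical `(unit_id, stratum, imputation_cell)` tuples, and a cell holding a single bootstrap unit is left intact so that it is never emptied (a rule the text does not specify).

```python
import random
from collections import defaultdict
from statistics import pvariance


def bootstrap_variance(sample, estimate_from_donors, B, rng):
    """Bootstrap variance of a mass imputation estimate.

    sample: list of (unit_id, stratum, imputation_cell) tuples for units in s.
    estimate_from_donors: callable taking a list of donor unit_ids, running
        the mass imputation on the frame and returning the estimate.
    """
    by_stratum = defaultdict(list)
    for unit in sample:
        by_stratum[unit[1]].append(unit)

    estimates = []
    for _ in range(B):                               # step 5
        # Step 1: two draws per stratum, equal probability, with replacement.
        boot = [rng.choice(units) for units in by_stratum.values() for _ in range(2)]
        # Step 2: remove one randomly chosen bootstrap unit per imputation cell.
        by_cell = defaultdict(list)
        for unit in boot:
            by_cell[unit[2]].append(unit)
        donors = []
        for units in by_cell.values():
            if len(units) > 1:                       # assumption: never empty a cell
                units.pop(rng.randrange(len(units)))
            donors.extend(u[0] for u in units)
        # Steps 3 and 4: impute the frame from these donors and estimate.
        estimates.append(estimate_from_donors(donors))
    # Step 6: simple variance of the B bootstrap estimates (divisor B).
    return pvariance(estimates)


sample = [("a", 1, "c1"), ("b", 1, "c1"), ("c", 2, "c1"), ("d", 2, "c2")]
toy_estimate = lambda donors: float(len(set(donors)))  # stand-in for steps 3-4
print(bootstrap_variance(sample, toy_estimate, B=200, rng=random.Random(0)))
```

Passing the imputation as a callback keeps the resampling logic separate from the choice of imputation method, so the same routine serves both the ascending and the ascending/descending variants.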

6. Simulations 

There were 2,000 simulations performed for each 
sample design described above. Mass imputations 
and bootstrap variances are computed for each 
simulation. The estimate and analysis statistics used 
in the simulation are described below. 

6.1 Estimates . — Four mass imputation estimates 
per state are computed: 

$$\hat{y}_{Me1} = \sum_{k \in N} S_k p_{ke1}, \qquad \hat{y}_{Me2} = \sum_{k \in N} S_k p_{ke2},$$

$$\hat{y}_{Mm} = \sum_{k \in N} S_k p_{km}, \qquad \text{and} \qquad \hat{y}_{Ms} = \sum_{k \in N} S_k p_{ks},$$

where $N$ is the set of all schools on the frame and: 

$S_k$ is the known number of students in school $k$ 
$p_{ke1}$: student proportion in school $k$ grades pre-kindergarten to 3 
$p_{ke2}$: student proportion in $k$ grades 4 to 6 
$p_{km}$: student proportion in $k$ grades 7 to 9 
$p_{ks}$: student proportion in $k$ grades 10 to 12 

It is assumed that $S_k$ is known for all $k$ and that 
only the $p$'s require collection. Therefore, when $k$ 
is not selected to be a responding unit, a nearest 
neighbor donor’s $p$ will be applied to $S_k$. 

Additionally, four Horvitz-Thompson estimates 
are computed within each state: 

$$\hat{y}_{e1} = \sum_{k \in s} w_k S_k p_{ke1}, \qquad \hat{y}_{e2} = \sum_{k \in s} w_k S_k p_{ke2},$$

$$\hat{y}_{m} = \sum_{k \in s} w_k S_k p_{km}, \qquad \text{and} \qquad \hat{y}_{s} = \sum_{k \in s} w_k S_k p_{ks},$$

where $w_k$ is the inverse of the selection 
probability and $s$ is the set of all selected schools. 
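
As a small numerical illustration of the two estimators, the sketch below computes a mass imputation total and a Horvitz-Thompson total for one grade range on a four-school toy frame. The data, the donor assignment and the function names are hypothetical; the donor list is assumed to come from a nearest neighbor search like the one in section 3.1.

```python
def mass_imputation_total(S, p, donors):
    """Sum over the whole frame of S_k times the proportion taken from k's donor.

    Every frame unit has weight 1; a respondent is its own donor.
    """
    return sum(S_k * p[d] for S_k, d in zip(S, donors))


def horvitz_thompson_total(S, p, w, sample):
    """Sum over the sampled schools of w_k * S_k * p_k."""
    return sum(w[k] * S[k] * p[k] for k in sample)


S = [100, 200, 300, 400]          # known enrollment for every frame school
p = [0.5, None, None, 0.25]       # proportion collected only from schools 0 and 3
donors = [0, 0, 3, 3]             # donor index for every frame school
w = {0: 2.0, 3: 2.0}              # inverse selection probabilities
print(mass_imputation_total(S, p, donors))           # 325.0
print(horvitz_thompson_total(S, p, w, [0, 3]))       # 300.0
```

The mass imputation total uses the known enrollment of all four schools, while the Horvitz-Thompson total uses only the two sampled schools and their weights.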

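6.2 Simulated Variance and Bias Estimates . - The 
two variance estimates computed within each 
sample and averaged across samples are: 

$$v^*(\hat{y}_{M\cdot}) = \frac{1}{B} \sum_{j=1}^{B} \left(\hat{y}^*_{M\cdot j} - \bar{y}^*_{M\cdot}\right)^2$$

$\hat{y}_{M\cdot}$: mass imputation estimates described above 
(the dot stands for any of e1, e2, m or s), 
$\hat{y}^*_{M\cdot j}$: a bootstrap estimate of $\hat{y}_{M\cdot}$, 
$\bar{y}^*_{M\cdot}$: the average of the bootstrap estimates $\hat{y}^*_{M\cdot j}$. 

Estimates of the true variance of the mass 
imputation estimate ($\hat{y}_{M\cdot}$) and the Horvitz-Thompson 
estimate ($\hat{y}_{\cdot}$) are provided below: 

$$V_T(\hat{y}_{M\cdot}) = \frac{1}{n} \sum_{s=1}^{n} \left(\hat{y}_{M\cdot s} - \bar{y}_{M\cdot}\right)^2$$

$\hat{y}_{M\cdot s}$ and $\bar{y}_{M\cdot}$ are the value of $\hat{y}_{M\cdot}$ for the $s$-th simulation 
and the average of the $\hat{y}_{M\cdot s}$, respectively. 

$$V_T(\hat{y}_{\cdot}) = \frac{1}{n} \sum_{s=1}^{n} \left(\hat{y}_{\cdot s} - \bar{y}_{\cdot}\right)^2$$

$\hat{y}_{\cdot}$: Horvitz-Thompson estimate, 
$\hat{y}_{\cdot s}$: $s$-th Horvitz-Thompson estimate, 
$\bar{y}_{\cdot}$: average of the Horvitz-Thompson estimates. 

The bias of the mass imputation estimate ($\hat{y}_{M\cdot}$) is 
estimated by: 

$$\mathrm{Bias}(\hat{y}_{M\cdot}) = \bar{y}_{M\cdot} - \bar{y}_{\cdot}$$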

7. Analysis Statistics from Simulations 
Four tables are provided at the end of this paper 
which provide summaries of our simulation results. 
Three key analytic statistics have been used: 

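(1) To evaluate the imputation methodology, the 
relative bias of the estimated standard error (RBS) 
is computed. 

$$\mathrm{RBS} = \left(\sqrt{\bar{v}^*(\hat{y}_{M\cdot})} - \sqrt{V_T(\hat{y}_{M\cdot})}\right) / \sqrt{V_T(\hat{y}_{M\cdot})}$$

where $\bar{v}^*$ is the bootstrap variance averaged across samples. 

(2) The relative precision of the mass imputation 
estimate (RPS) is given by: 

$$\mathrm{RPS} = \sqrt{V_T(\hat{y}_{M\cdot}) + \mathrm{Bias}^2(\hat{y}_{M\cdot})} \, / \sqrt{V_T(\hat{y}_{\cdot})}$$

(3) The relative bias of the mass imputation 
estimate (RBE) 

$$\mathrm{RBE} = (\bar{y}_{M\cdot} - \bar{y}_{\cdot}) / \bar{y}_{\cdot}$$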
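
The three statistics can be computed from simulation output as sketched below. The formulas in the source are partly illegible, so this sketch rests on one reading of them, stated here as assumptions: standard errors (square roots of variances) are compared, the bootstrap variance is averaged across simulated samples before taking the root, and the average Horvitz-Thompson estimate serves as the reference for the bias. The function name `analysis_statistics` is hypothetical, and results are returned as ratios rather than percentages.

```python
from math import sqrt
from statistics import mean, pvariance


def analysis_statistics(mi_estimates, ht_estimates, boot_variances):
    """Return (RBS, RPS, RBE) from per-simulation results.

    mi_estimates: mass imputation estimate from each simulated sample.
    ht_estimates: Horvitz-Thompson estimate from each simulated sample.
    boot_variances: bootstrap variance estimate from each simulated sample.
    """
    v_true_mi = pvariance(mi_estimates)     # simple variance across simulations
    v_true_ht = pvariance(ht_estimates)
    bias = mean(mi_estimates) - mean(ht_estimates)
    rbs = (sqrt(mean(boot_variances)) - sqrt(v_true_mi)) / sqrt(v_true_mi)
    rps = sqrt(v_true_mi + bias ** 2) / sqrt(v_true_ht)
    rbe = bias / mean(ht_estimates)
    return rbs, rps, rbe


rbs, rps, rbe = analysis_statistics(
    mi_estimates=[10, 12, 14, 16],
    ht_estimates=[9, 11, 13, 15],
    boot_variances=[5, 5],
)
print(round(rbs, 4), round(rps, 4), round(rbe, 4))
```

Under this reading, an RPS above 1 (above 100 in the tables) means the mass imputation estimate has the larger root mean square error.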

7.1 Table 1 Overall Comparisons . — Table 1 
displays how mass imputation might work in the 
SASS setting using ascending imputations. The 
answer to the question of whether this approach is 
satisfactory is “no.” The precision of mass 
imputation relative to the Horvitz-Thompson 
clearly gives the Horvitz-Thompson estimator the 
advantage. The mass imputation estimator only 
once has a large efficiency gain over Horvitz- 
Thompson (18.4 percent). Seven times there was 
not much difference between the two estimation 
methods. In these cases, the gains ranged from 
-11.5 to +12.4 percent. On the other hand, there 
were eight times when mass imputation had a big 
precision loss. In these latter cases, the loss was 
between -15.3 and -61.1 percent. Overall, mass 
imputations perform poorly for these estimates in 
the states tested. 

7.2 Table 1 Standard Error Comparisons . — Table 1 
also demonstrates with ascending imputations, 
through the relative bias of the standard error 
(RBS) values shown there, that our variance 
procedure underestimates the standard error. Most 
of the time the absolute bias is less than 10 percent. 
The reason for the underestimate may be the way 
the bootstrap sample sizes were reduced by one. 
Some theoretical work may be needed to provide an 
unbiased estimator. Another possible reason (see 
Table 2 below) is the inappropriateness of the 
missing completely at random assumption for the 
nonrespondents. 

7.3 Table 2 Standard Error Comparisons . — The 
purpose of table 2 is to investigate (using ascending 
imputations) the robustness of the missing 
completely at random assumption of the variance 
procedure. Comparing the relative biases of the 
standard error in tables 1 and 2 shows that the 
standard errors are reasonably close to the true 
values, but are still consistent underestimates. When the 
missing completely at random assumption is 
violated (table 1), there are more large 
underestimates than when the assumption is not 
violated. This indicates the missing completely at 
random assumption is not critical in this setting, but 
cannot be completely ignored. 

7.4 Other Points from Table 2 . — Table 2 also 
demonstrates another point. Since the selection and 
allocation are designed to produce an equal 
probability sample, one might expect the Horvitz- 
Thompson estimator to be inefficient estimating 
total numbers of students. That being the case, the 
mass imputation might compensate for the 
deficiencies in the sample design and be more 
efficient than the Horvitz-Thompson estimator. 
This does not appear to be true. The mass 
imputation does much worse than the Horvitz- 
Thompson estimator five times. In those cases, the 
loss of precision ranged from -21.2 to -43.1 percent. 
The mass imputation has large precision gains five 
times. The gains ranged from +19.4 to +33.4 
percent. However, it should be noted that mass 
imputation with the equal probability design 
performed better than the unequal probability 
design. Therefore, there is some truth to the 
assertion made at the beginning of this paragraph. 

7.5 Table 3 Ascending and Descending Imputation . - 
The purpose of table 3 is to determine whether 
mass imputation might work in the SASS setting 
when both ascending and descending imputations 
are used. The answer to this question is still “no.” 
Twice the mass imputation was much better than 
Horvitz-Thompson. In these cases, the gains were 
+23.2 and +27.2 percent. Eleven times there was 
not much difference between the two estimation 
methods. In these cases, the gains ranged from -14 
to +11.2 percent. Three times, mass imputation had 
a big precision loss, between -17.1 and -21.9 percent. 

Still, as might be expected intuitively, using the 
ascending/descending imputation works much 
better than just ascending imputation. This is seen 
by comparing Tables 1 and 3. Using the ascending 
imputation, mass imputation is reasonably close or 
better than Horvitz-Thompson eight times, while 
the ascending/descending is reasonably close or 
better thirteen times. One reason for this is that the 
ascending/descending imputation is generally less 
biased. Since both imputation methodologies have 
small biases, this is not the main reason for the 
difference. 

The main reason for the difference is a smaller 
variance. Now there are two ways to reduce this 
variance: (1) reduce the variability of the donor 
enrollment counts or (2) reduce the variability of 
the weights. Since donor enrollment counts are the 
same for both imputation methodologies, the 
reduction in variance is coming entirely from 
reducing the variability of the weights. Since the 
ascending/descending mass imputation imputes 
values from both sides of the missing data at an 
expected 1 to 1 rate, the implicit weight for this 
process will be (more or less) the moving average 
of the individual ascending and descending 
imputation weights. Since the units are sorted by 
size, the weights should be increasing as you go 
down the file. Therefore, the moving average of the 
weights should be less variable than the individual 
weights. 

7.6 Table 4 On Bootstrap Bias Issues . — Table 4 
displays what happens when reducing the bootstrap 
sample sizes by one at the sampling stratum level, 
rather than at the imputation cell level as earlier. As 
can be seen in the relative bias of the standard error 
from table 4, the standard errors are overestimated 
by +24 to +48 percent. This shows that reducing the 
bootstrap sample size at the imputation cell level, 
although a slight underestimate, is better than doing 
it at the sample stratum level. Again, some work on 
the theory would seem to be needed here. 

8.0 Conclusions And Areas For Future Study 

8.1 Some Basic Conclusions . — A lot was learned 
from the simulation work discussed here: 

(1) We remain convinced, for example, of the appeal of 
being able to make greater use of the frame 
variables to improve estimation; however, we now 
have a much greater appreciation of the practical 
difficulties in an actual implementation of such a 
procedure. 

(2) Broadly speaking, for the states and variables 
used in the present analysis, the bias in the mass 
imputation estimator, no matter how conducted, is 
relatively small. This suggests that any of the 
nearest neighbor imputation variants employed here 
would be sufficient for a mass imputation process. 

(3) However, while at times the mass imputation 
estimator did outperform the Horvitz-Thompson 
estimator, it just never performed better overall. 

(4) Of the two methods of imputation employed, 
the ascending/descending method was clearly 
superior to just using the ascending method alone. 

(5) As our work on the robustness of the “missing 
at random” assumption demonstrated, the sample 
design and the imputation procedure cannot be 
treated independently. 

(6) The variance estimation procedure proposed in 
this paper seems to work reasonably well. Most of 
the time, it underestimates the variance slightly. 
Occasionally, though, the variance is greatly 
underestimated, especially when the selection 
probabilities within imputation cells are unequal. 

(7) The proposed variance procedure does not 
appear to be unbiased, although some ad hoc 
adjustments seem clearly better than others (e.g., 
adjusting at the imputation cell rather than stratum 
level). 

8.2 Some Next Steps and Second Thoughts . — 
Without question we were disappointed with the 
performance of mass imputation. While not a 
failure, so far it has not delivered on our 
expectations. Some conjectures about why: 

(1) One possible reason is that, even though the 
imputation took school size into account, there 
were not enough large schools selected in the equal 
probability design to measure the large school’s 
distribution appropriately. As noted already, users 
of mass imputation must take the sample design 
into account when determining an appropriate 
imputation. 

(2) Another possibility is that a better imputation 
procedure needs to be used. Fixing either of these 
possibilities requires designers very knowledgeable 
about the data being imputed. Clearly, just using a 
relatively efficient general imputation procedure, 
like nearest neighbor, does not guarantee good 
performance of the mass imputation estimator. 

(3) In doing the imputations, no control was 
introduced on the number of times a donor case was 
used. If done, this, all by itself, might have 
improved our results dramatically. We conjecture 
that the cases where extremely poor results were 
obtained would have been lessened. 

(4) Theoretical work on nearest neighbor 
imputation, given at these meetings, also is a place 
to look for ideas for improvements (Chen and Shao 
1997). In addition, theoretical work seems required 
to find an unbiased variance methodology. Even so, 
given the general difficulty of variance estimation 
for indirect estimates, the variance procedure 
described here may be applicable to the indirect 
estimation problem stated in section 2. We are less 
sure, by the way, about the application of mass 
imputation to NCES’s data warehousing work. 

(5) If the ascending/descending mass imputation is 
used when the donors are selected with equal 
probability, then the mass imputation may well 
outperform Horvitz-Thompson. However, there 
was not the time to do this simulation for the 
present paper. 

(6) While we started off our work determined to 
better the GLS procedures studied earlier, no direct 
comparison was made here with a comparable GLS 
estimator. We conjecture that mass imputation at 
best may be not much better than a “wash” when 
imputing based on a single variable; but that as the 
dimensionality of the information used from the 
frame grows, mass imputation may yet show its 
value. 

Table 1 -- Relative Precision (RPS), Bias (RBS) of 
the Mass Imputation Standard Error and Relative 
Bias (RBE) of the Estimator using SASS Sample 
Design and Ascending Imputations 

                  Standard Error        Estimate
                Relative    Relative    Relative
State  Est.     Precision   Bias        Bias
2      y_Me1      100.2       -5.7         0.1
       y_Me2       89.7       -8.8        -0.2
       y_Mm       112.4        8.4        -2.0
       y_Ms        88.5        5.5         2.7
9      y_Me1      136.2      -17.6        -3.3
       y_Me2      126.0      -18.0        -0.1
       y_Mm       137.3      -16.5         3.0
       y_Ms        81.4       -6.9         1.9
10     y_Me1      132.2       -3.3        -3.3
       y_Me2      115.3       -3.8         3.8
       y_Mm       118.9        3.6         0.9
       y_Ms        99.4       11.6     [illegible]
24     y_Me1      161.1       -9.7        -4.5
       y_Me2      108.0      -19.3        -0.5
       y_Mm       156.1       -9.4         5.3
       y_Ms       104.4      -10.8         1.3

Table 2 -- Relative Precision (RPS), Bias (RBS) of 
the Mass Imputation Standard Error and Relative 
Bias (RBE) of the Estimator with Equal Probability 
Selection of Donors and Ascending Imputations 

                  Standard Error        Estimate
                Relative    Relative    Relative
State  Est.     Precision   Bias        Bias
2      y_Me1       92.3      -10.6         0.1
       y_Me2       72.9      -10.0         0.5
       y_Mm       110.7       -5.3         1.5
       y_Ms        80.6       -2.2        -2.9
9      y_Me1      123.2      -11.7        -0.8
       y_Me2      106.0      -16.6         0.0
       y_Mm       112.3      -12.6        -0.2
       y_Ms        66.6       -2.9         1.6
10     y_Me1      133.2       -0.1        -1.5
       y_Me2       98.5        0.4         2.1
       y_Mm       121.2       -2.9         2.0
       y_Ms        96.6       12.4        -3.0
24     y_Me1      143.1      -13.3        -2.4
       y_Me2       80.4      -14.1        -0.3
       y_Mm       128.7      -13.2         3.1
       y_Ms        68.6       -8.0         0.8

Table 3 -- Relative Precision (RPS), Bias (RBS) of 
the Mass Imputation Standard Error and Relative 
Bias (RBE) of the Estimator using SASS Sample 
Design and Ascending and Descending Imputations 

                  Standard Error        Estimate
                Relative    Relative    Relative
State  Est.     Precision   Bias        Bias
2      y_Me1       92.6       -1.0         0.9
       y_Me2       88.8       -6.0        -0.3
       y_Mm       103.2       14.7        -1.8
       y_Ms        72.8       21.9         0.3
9      y_Me1      109.9      -17.6        -1.4
       y_Me2      110.5      -18.3         0.2
       y_Mm       109.1      -15.0         1.0
       y_Ms        76.8       -1.9         1.1
10     y_Me1      121.9        0.5        -0.1
       y_Me2      108.2        1.9         0.5
       y_Mm       114.0        3.1         0.0
       y_Ms        94.4       15.4        -0.4
24     y_Me1      120.3      -13.7        -1.4
       y_Me2       92.1      -20.9        -0.2
       y_Mm       117.1      -13.4         1.7
       y_Ms        93.9       -4.8         0.3

Table 4 -- Relative Bias (RBS) of the Mass Imputation Standard Error, Adjusting the Bootstrap Sample 
Size at the Donor Selection Stratum Level using Ascending Imputations 

               Relative                 Relative                 Relative                 Relative
State  Est.    Bias      State  Est.    Bias      State  Est.    Bias      State  Est.    Bias
2      y_Me1   36.4      9      y_Me1   24.3      10     y_Me1   43.8      24     y_Me1   28.9
       y_Me2   30.6             y_Me2   24.6             y_Me2   42.6             y_Me2   24.1
       y_Mm    30.5             y_Mm    25.2             y_Mm    32.2             y_Mm    29.6
       y_Ms    30.5             y_Ms    46.0             y_Ms    47.6             y_Ms    27.5

THE EFFECT OF MODE OF INTERVIEW ON ESTIMATES FROM THE 1993-94 SCHOOLS 
AND STAFFING SURVEY (SASS) PUBLIC SCHOOL TEACHER SURVEY 

Cornette L. Cole, Robert C. Abramson, Randall J. Parmer, Dennis J. Schwanz 
Cornette L. Cole, U.S. Bureau of the Census, Washington, D.C. 20233 

I. Introduction 

To reduce the cost of data collection and to improve the 
efficiency, data was collected for the 1994 Schools and 
Staffing Survey's (SASS) Public School Teacher Survey 
by mail, telephone, and personal visits. 

During the selection of sample teachers, a split-panel 
design was used where two-thirds of the teacher sample 
was randomly assigned for mail nonresponse follow-up 
interviewing from centralized computer-assisted 
telephone interviewing (CATI) facilities and the 
remaining one-third was assigned for telephone follow-up 
interviewing from decentralized facilities. The teachers 
were randomly assigned a teacher follow-up mode flag 
value of 1, 2, or 3 as indicated below. We used this 
teacher follow-up mode designation flag to separate 
teacher records and formed the CATI and nonCATI 
treatment groups we used for this analysis. 

1 CATI 

2 nonCATI 

3 CATI (these records were initially held for possible 
sample reductions) 

Data from the 1994 Public Teacher Survey was collected 
primarily through self-administered questionnaires, where 
sample teachers completed the questionnaires and 
returned them by mail. About 69% of the total interviews 
were mail returns. 

Telephone calls were made from either CATI or 
decentralized facilities to teachers who did not return their 
questionnaires by mail. Personnel from the Census 
Bureau's Field Division were asked to determine the 
workload capacity of the centralized telephone 
interviewing (CATI) facilities. From the teachers who 
had not returned their questionnaires by mail, records 
whose follow-up mode flag designated them for CATI 
follow-up interviewing were sent to the CATI 
facilities to be interviewed, up to the workload 
capacity indicated. 



The remaining CATI-designated cases, that is, those that 
CATI couldn't handle, were sent to be interviewed by 
decentralized telephone interviewing, along with mail 
nonrespondents previously designated for this follow-up 
mode. About 19% of the total interviews were completed 
in CATI interviews and 12% in decentralized (NON- 
CATI) telephone interviews. 

A very small number of teachers could not be 
interviewed through either of these telephone methods; 
their interviews were completed during visits to schools 
by Census Bureau field representatives. 

To be certain there was no bias in survey estimates 
because we used different modes in the telephone follow- 
up of mail nonrespondents, we initiated this study to 
compare the data we collected in CATI interviews with 
those collected in decentralized telephone interviews. 

A. The Schools and Staffing Survey 

The SASS is a periodic survey sponsored by the National 
Center for Education Statistics (NCES) and conducted by 
the U. S. Bureau of the Census. The SASS provides data 
on the policies and conditions of public and private 
elementary and secondary schools, principals, libraries, 
librarians, teachers and students in the United States. 

The school, principal, library, librarian, teacher, and 
student samples were selected so that data from each of 
the components could be linked. For the 1993-94 school 
year, about 13,000 schools, 67,000 teachers, 7,600 
libraries and librarians, and 6,900 students were selected 1 
for SASS as follows: 

• Private and public sample schools were selected first. 

• All principals from SASS sample schools were in 
sample for the School Administrator Survey, 

• A sample of teachers was selected within each of the 
SASS sample schools for the Teacher Survey. 



1 S. Kaufman et al. (1996). 1993-94 Schools and Staffing 
Survey: Sample Design and Estimation. NCES 96-089. 
U.S. Department of Education, Office of Educational 
Research and Improvement. Washington, D.C.: National 
Center for Education Statistics. 

• A subsample of SASS sample schools was selected 
for the Library and Librarian Surveys. 

• And a subsample of SASS sample schools and 
teachers was selected to participate in the Student 
Record Survey. 

B. Public School Teacher Survey Sampling Procedure 

The sample of teachers for the Public School Teacher 
Survey was selected from SASS sample schools. Each 
sample school was asked to provide a list of its teachers 
with the information below for each teacher: 

• whether the teacher was new (less than three years 
experience) or experienced, 

• the teacher's race and ethnicity, 

• whether he or she was considered a Bilingual or 
English as a Second Language (ESL) teacher, and 

• his or her main field of teaching. 

Within each sample school, sample teachers were 
classified into one of the following five strata in the 
hierarchical order listed below. For example, if a teacher 
was both API and bilingual, the teacher was assigned to 
the API stratum. 

(1) Asian or Pacific Islander (API) 

(2) American Indian, Aleut, or Eskimo (AIAE) 

(3) Bilingual 

(4) New 

(5) Experienced 
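
The hierarchical assignment described above can be sketched in a few lines of Python. This is only an illustration: the `Teacher` record, its field names, and the category codes are hypothetical and are not taken from the SASS processing system. The point is that a teacher is placed in the first stratum, in the listed order, whose condition is met.

```python
from dataclasses import dataclass


@dataclass
class Teacher:
    # All field names and codes below are hypothetical, for illustration only.
    race_ethnicity: str    # assumed codes: "API", "AIAE", or "other"
    bilingual_esl: bool    # True if listed as a Bilingual or ESL teacher
    years_experience: int  # "new" means less than three years of experience


def assign_stratum(teacher: Teacher) -> str:
    """Return the first stratum, in hierarchical order, that applies."""
    if teacher.race_ethnicity == "API":
        return "API"
    if teacher.race_ethnicity == "AIAE":
        return "AIAE"
    if teacher.bilingual_esl:
        return "Bilingual"
    if teacher.years_experience < 3:
        return "New"
    return "Experienced"
```

Under these assumptions, a teacher who is both API and bilingual is assigned to the API stratum because that check comes first.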

Within each school and teacher stratum, teachers were 
selected with equal probability. From the lists of teachers 
provided by the schools, 56,736 public school teachers 
were selected. 

C. Estimation 

The weight used to produce estimates of public school 
teacher characteristics was a product of the following 
weight and factors: 

• Basic Weight - the inverse of the probability of 
selection 

• School Sampling Adjustment Factor - an adjustment 
to the school's probability of selection to account for 
school mergers, splits, and duplicates 

• School Nonresponse Adjustment Factor - an 
adjustment to account for teachers whose schools did 
not provide a list of their teachers 



• Teacher Within School Noninterview Adjustment 
Factor - an adjustment that accounts for teacher 
nonrespondents 

• Frame Ratio Adjustment Factor - a factor which 
adjusts teacher estimates to the total universe count of 
teachers from the public school sample frame 

• Teacher Adjustment Factor - an adjustment which 
makes estimates of the weighted number of teachers 
from the SASS School and Teacher Survey consistent 
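
As a rough illustration of how the final weight is assembled, the sketch below multiplies the basic weight by each adjustment factor. The function name, the dictionary keys, and the numeric values are hypothetical; only the structure (inverse selection probability times a product of factors) comes from the description above.

```python
import math


def final_teacher_weight(selection_probability, adjustment_factors):
    """Basic weight (inverse selection probability) times all factors."""
    basic_weight = 1.0 / selection_probability
    return basic_weight * math.prod(adjustment_factors.values())


# Hypothetical factor values for a single teacher record.
example_factors = {
    "school_sampling_adjustment": 1.00,
    "school_nonresponse_adjustment": 1.10,
    "teacher_noninterview_adjustment": 1.20,
    "frame_ratio_adjustment": 0.95,
    "teacher_adjustment": 1.02,
}
example_weight = final_teacher_weight(0.05, example_factors)
```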

II. Methodology 

A. Estimates for the Analysis 

A flag was assigned after the interviews were completed 
to indicate the actual mode of interview. The flag to 
indicate which telephone mode should be used to follow 
up mail nonrespondents was assigned prior to the initial 
questionnaire mailing. We used the follow-up mode 
designation flag to separate interviewed teacher records 
and form a CATI and a NON-CATI treatment group for 
our analysis. Therefore, the treatment name (CATI or 
NON-CATI) is not necessarily an indicator of how the 
interview was actually completed. 

The CATI treatment comprises all teachers who were 
designated for mail nonresponse follow-up interviewing 
from CATI facilities. Their interviews were actually 
either returned by mail or completed by telephone from 
CATI facilities. 

The NON-CATI treatment comprises all teachers who 
were assigned for mail nonresponse follow-up 
interviewing from decentralized facilities. Their 
questionnaires may have actually been returned by mail 
or completed in interviews from decentralized facilities. 

Recall that two-thirds of the teacher sample was assigned 
for CATI follow-up interviewing and the remaining one- 
third for telephone follow-up interviewing from 
decentralized facilities. To ensure that the estimates we 
produced from records in our CATI and NON-CATI 
treatments would be approximately equal to the estimates 
we got for the entire sample, we multiplied the teacher 
basic weights on records in the CATI treatment by 1.5 
and those in the NON-CATI treatment by 3.0 (the 
inverses of the two-thirds and one-third assignment 
rates). Then, 

• We processed the CATI and NON-CATI data sets 
(separately) through the same weighting procedure 
used to weight the regular Public School Teacher 
Survey data. 

• We further separated the reweighted teacher records 
by actual mode of interview within each treatment. 

• We finally produced CATI and NON-CATI treatment 
estimates for the Public School Teacher Survey 
questionnaire items. 
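
The effect of the 1.5 and 3.0 factors can be checked with a small sketch. The sample of nine teachers and the common basic weight of 10 are made-up numbers; the factors are the ones stated above. With two-thirds of the records in the CATI treatment and one-third in the NON-CATI treatment, each treatment's inflated weights sum to the same total as the full sample.

```python
# Factors from the text: inverse of the 2/3 and 1/3 assignment rates.
TREATMENT_FACTOR = {"CATI": 1.5, "NON-CATI": 3.0}


def treatment_basic_weight(basic_weight, treatment):
    """Inflate a basic weight so one treatment alone represents the full sample."""
    return basic_weight * TREATMENT_FACTOR[treatment]


# Hypothetical sample: nine teachers, each with basic weight 10.
sample = [("CATI", 10.0)] * 6 + [("NON-CATI", 10.0)] * 3

full_total = sum(w for _, w in sample)
cati_total = sum(treatment_basic_weight(w, t) for t, w in sample if t == "CATI")
noncati_total = sum(
    treatment_basic_weight(w, t) for t, w in sample if t == "NON-CATI"
)
```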

We made CATI vs. NON-CATI comparisons of estimates 
for the following groups of teacher records: 

Comparison Group 1: All interviews (all modes) 

All teachers who were designated for telephone follow-up 
from centralized facilities (regardless of the mode the 
interview was completed in) 

vs. 

those who were designated for telephone follow-up from 
decentralized facilities (regardless of the mode the 
interview was completed in) 

Comparison Group 2: Mail interviews only 

Teachers who were designated for telephone follow-up 
from centralized facilities who returned their 
questionnaires by mail 

vs. 

those who returned their questionnaires by mail who had 
been designated for telephone follow-up from 
decentralized facilities 

Comparison Group 3: Interviews completed during 
telephone follow-up only 

Teachers who were designated for telephone follow-up 
from centralized facilities and their interviews were 
completed in centralized telephone interviews 

vs. 

those who were designated for telephone follow-up from 
decentralized facilities and their interviews were 
completed in decentralized telephone interviews 

Our primary interest was in Comparison Group 3. For 
these respondents, the telephone follow-up mode flag and 
the flag which indicates the actual interview modes have 
the same value. 
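
A minimal sketch of how the two flags could be combined to place a record in the comparison groups is given below. The flag values ("CATI", "NON-CATI", "mail", "decentralized") and the function name are assumptions made for illustration; the actual SASS flag coding is not given in this paper.

```python
def comparison_groups(designation, actual_mode):
    """Return the set of comparison groups (1, 2, 3) a record contributes to.

    designation: follow-up mode assigned before mailing, "CATI" or "NON-CATI"
    actual_mode: how the interview was completed, assumed to be one of
                 "mail", "CATI", or "decentralized"
    """
    groups = {1}  # Group 1 uses all interviews, regardless of mode
    if actual_mode == "mail":
        groups.add(2)  # Group 2 uses mail returns only
    elif (designation, actual_mode) in {
        ("CATI", "CATI"),
        ("NON-CATI", "decentralized"),
    }:
        groups.add(3)  # Group 3: designation and actual telephone mode agree
    return groups
```

In this sketch a record designated for decentralized follow-up but completed by CATI contributes only to Group 1, which is one possible reading of the group definitions above.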

B. Computing the Variances for the Analysis 

We used the Balanced Repeated Replication (BRR) 
method in WESVAR² to compute variances for each 
estimate. The WESVAR BRR procedure uses replication 
techniques to calculate variances for estimates using: 

\[ v(\hat{\theta}) = \frac{1}{G'} \sum_{g=1}^{G'} (\hat{\theta}_g - \hat{\theta})^2 \] 

where, 

$\hat{\theta}$ = the estimator for the teacher questionnaire item 

$\hat{\theta}_g$ = the estimate computed from the g-th replicate 

$v(\hat{\theta})$ = the variance of the estimate 

$G'$ = the number of replicates 

The replicate weights we used to compute the variance 
estimates in WESVAR were computed using the same 
replicate factors used to calculate variance estimates for 
the regular 1993-94 SASS publication estimates. 
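
The replication variance can be sketched as follows, assuming the usual BRR form in which the variance is the mean squared deviation of the replicate estimates from the full-sample estimate. The function name and the example numbers are illustrative only and do not reproduce WESVAR itself.

```python
def brr_variance(full_estimate, replicate_estimates):
    """Mean squared deviation of replicate estimates from the full estimate."""
    g = len(replicate_estimates)
    if g == 0:
        raise ValueError("at least one replicate estimate is required")
    return sum((r - full_estimate) ** 2 for r in replicate_estimates) / g


# Hypothetical full-sample estimate and four replicate estimates.
example_variance = brr_variance(10.0, [9.0, 11.0, 10.5, 9.5])
```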

C. Comparing Treatment Estimates 

We evaluated the magnitude of the differences between 
CATI and NON-CATI estimates to see if they were 
statistically significant. We formed the null hypothesis, 

\[ H_0: \hat{\theta}_{\text{CATI}} = \hat{\theta}_{\text{NON-CATI}} \] 

which says an estimate produced using the records of 
CATI teachers ($\hat{\theta}_{\text{CATI}}$) is the same as that produced using 
the records of NON-CATI teachers ($\hat{\theta}_{\text{NON-CATI}}$). 

To test the hypothesis, we used the 'z' statistic: 

\[ z = \frac{\hat{\theta}_{\text{CATI}} - \hat{\theta}_{\text{NON-CATI}}}{\sqrt{var(\hat{\theta}_{\text{CATI}}) + var(\hat{\theta}_{\text{NON-CATI}})}} \] 

where, 

• $\hat{\theta}$ is the estimate of the teacher characteristic of 
interest, $var(\hat{\theta})$ is its variance, 

• the numerator is the difference between the CATI and 
NON-CATI estimates and 

• the denominator is an estimate of the standard error of 
the difference. 

A negative value for the z meant the NON-CATI estimate 
was higher, while a positive z value meant the CATI 
estimate was higher. Results of the significance tests are 
presented in Section III. 
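
The test can be sketched in Python as below. The estimates and variances in the example are hypothetical; the critical value 1.645 is the standard normal cutoff for a two-sided test at the 0.10 level, which is the level used in this study.

```python
import math


def z_statistic(est_cati, var_cati, est_noncati, var_noncati):
    """Difference of the two estimates over the standard error of the difference."""
    return (est_cati - est_noncati) / math.sqrt(var_cati + var_noncati)


def is_significant(z, critical_value=1.645):
    """Two-sided test at alpha = 0.10 using the normal critical value."""
    return abs(z) > critical_value


# Hypothetical percentages and BRR variances for one questionnaire item.
z_example = z_statistic(52.0, 4.0, 47.0, 5.0)
```

A positive value of `z_example` corresponds to a higher CATI estimate, matching the sign convention described above.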

D. Evaluating the Distribution of the Differences 
between Treatment Estimates 



² Westat, Inc., The WESVAR SAS Procedure, Version 
1.2, Rockville, MD: Westat, Inc. 

Our significance tests evaluated the magnitude of the 
difference between CATI and NON-CATI estimates 
individually for each teacher questionnaire item. We 
used the sign rank test to evaluate the distribution of the 
differences across the items. 

We used the SAS PROC UNIVARIATE³ procedure to 
perform the Wilcoxon Signed-Rank Test. We assumed 
each difference was equally likely to be positive or 
negative and that the distribution of the differences is 
symmetrical. We tested the hypothesis that the median of 
the differences between the CATI and NON-CATI 
estimates is zero. The following steps are involved in the 
test: 

• The absolute values of the differences are assigned 
ranks by magnitude, from smallest to largest, then the 
positive and negative signs are restored to the ranked 
values. 

• The totals of the ranks with negative signs and those 
with positive signs are calculated. 

The Wilcoxon signed rank statistic S is computed in SAS 
as follows: 

\[ S = \sum r_i^{+} - \frac{n(n+1)}{4} \] 

where, 

• S is distributed as a sum of scaled binomial 
distributions 

• $r_i^{+}$ is the rank of $|x_i|$ after discarding values of $x_i = 0$, 
and $x_i$ is the difference between the CATI and NON- 
CATI estimates ($x_i = \hat{\theta}_{\text{CATI}} - \hat{\theta}_{\text{NON-CATI}}$) 

• n is the number of nonzero $x_i$ values and 

• the sum is computed over the values of $x_i$ greater than 
zero. 

The significance level of S is computed by treating the 
statistic 

\[ \frac{S \sqrt{n-1}}{\sqrt{nV - S^2}} \] 

as a Student's t variate with n - 1 degrees of freedom, 
where 

\[ V = \frac{n(n+1)(2n+1) - 0.5 \sum t_i (t_i + 1)(t_i - 1)}{24} \] 

The sum is calculated over differences tied in absolute 
value and $t_i$ is the number of tied values in the i-th 
group of ties. 
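
The computation can be sketched in plain Python as follows. This is our reading of the formulas above (centered rank sum S, tie-corrected V, and the t-type approximation), not a reproduction of the SAS code; it assumes tied absolute differences receive average ranks and that at least two nonzero differences are present. The exact small-sample distribution and the p-value lookup are not covered by this sketch.

```python
import math
from collections import Counter


def signed_rank(differences):
    """Return (S, V, t) for the Wilcoxon signed rank test as described above."""
    x = [d for d in differences if d != 0]      # discard zero differences
    n = len(x)
    abs_sorted = sorted(abs(d) for d in x)

    # Average rank for each distinct absolute value (ties share a rank).
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and abs_sorted[j] == abs_sorted[i]:
            j += 1
        rank_of[abs_sorted[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    # S: sum of ranks of the positive differences, centered at n(n+1)/4.
    s = sum(rank_of[abs(d)] for d in x if d > 0) - n * (n + 1) / 4

    # V: variance term with the correction for groups of tied values.
    tie_sizes = Counter(abs_sorted).values()
    tie_term = sum(t * (t + 1) * (t - 1) for t in tie_sizes)
    v = (n * (n + 1) * (2 * n + 1) - 0.5 * tie_term) / 24

    # Approximate t statistic; requires n >= 2 so the denominator is positive.
    t_stat = s * math.sqrt(n - 1) / math.sqrt(n * v - s * s)
    return s, v, t_stat
```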



³ SAS Institute Inc., SAS Procedures Guide, Version 6, 
Third Edition, Cary, NC: SAS Institute Inc. 



SAS outputs a probability or p-value that is a measure of 
the strength of the evidence against the null hypothesis. 
If the p-value is less than the significance level of the test, 
which in our case is 0.10, the null hypothesis should be 
rejected. The smaller the p-value, the stronger the 
evidence for rejecting the null hypothesis. 
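
As a small worked example of this decision rule, the sketch below applies the 0.10 cutoff to the p-values reported in Tables 2 and 3 of this paper. The dictionary layout and the function name are our own; the numbers are the ones shown in the tables.

```python
ALPHA = 0.10  # significance level used in this study


def reject_null(p_value, alpha=ALPHA):
    """Reject the null hypothesis when the p-value is below alpha."""
    return p_value < alpha


# p-values as reported in Table 2 (all items) and Table 3 (negative
# responses to attitude and perception items only).
table_2 = {"All Interviews": 0.4092, "Mail Returns": 0.3782,
           "Telephone Follow-up Interviews": 0.7112}
table_3 = {"All Interviews": 0.0001, "Mail Returns": 0.0012,
           "Telephone Follow-up Interviews": 0.0019}

decisions_all_items = {k: reject_null(p) for k, p in table_2.items()}
decisions_negative_items = {k: reject_null(p) for k, p in table_3.items()}
```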

III. Results 

A. Tests of Significance 

At the α = .10 level of significance, we would expect no 
more than about 10 percent of the estimates within a 
group to differ significantly by chance alone. Table 1 
provides a summary of 
the results of our significance tests. The table shows that 
for all three groups, more than 10 percent of the 
comparisons yielded statistically significant results. 

Table 1 also shows that the interviews completed during 
telephone follow-up (Comparison Group 3) had the 
highest proportion of significant differences. In this 
group, we are comparing the responses of teachers who 
were actually interviewed from CATI facilities with those 
interviewed from decentralized telephone facilities. We 
see in Table 1 that the proportion of significant 
differences among these respondents was about twice 
that among mail respondents. 

B. Items with significant differences 

Most of the significant differences were between the 
responses of teachers in the two treatments to the series of 
questions labeled "Perceptions and Attitudes Toward 
Teaching" (Section E of the 1993-94 Public School 
Teacher questionnaire). 

In general, NON-CATI treatment estimates were higher 
for categories of items which have negative connotations, 
while CATI treatment estimates were higher for 
responses which suggest these teachers had a more 
positive outlook. 

The NON-CATI treatment estimates were higher for items 
indicating that: 

• More of these teachers believed their principal did not 
enforce student rules, did a poor job of getting 
resources, and did not let them know what was 
expected of them. 

• More of them reported they had little influence or no 
control over the curriculum, textbooks, homework, 
and teacher evaluations. 

• More of them said they would remain in the same 
school system, but would teach at another school the 
next year. 

On the other hand, the CATI treatment estimates were 
higher for items which suggest the attitudes of most of 
these teachers were more positive. 

• More of them reported their principals let them know 
what was expected from them and their schools' 
administrations treated them fairly and were 
supportive. 

• More of them said their principals enforced school 
rules and backed them when they needed it. 

• More of them planned to continue teaching at the 
same school the next school year. 

Responses to the question "If you could go back to your 
college days and start over again, would you become a 
teacher or not?" summarize the contrast in attitude 
between teachers in the two treatments. The CATI 
estimate was higher for the category 'certainly would 
become a teacher' and the NON-CATI estimate was 
higher for the category 'chances about even for or 
against'. 



C. Sign Rank Tests 

In Table 2 below, probability values (Pr ≥ |S|) for the 
two-tailed tests are shown. Each p-value is greater than 
10 percent, indicating that the hypothesis that the median 
of the differences between CATI and NON-CATI 
estimates is zero should not be rejected. 

Table 2 also shows that the group consisting of interviews 
completed during telephone follow-up had the highest 
p-value. Thus, there is no evidence favoring the rejection 
of the hypothesis about the distribution of the differences 
for these respondents. 

Our assumption for the sign-rank test was that each 
difference was equally likely to be positive or negative 
and that the distribution of the differences was 
symmetrical. The test results say there were about as 
many differences with the CATI estimate higher as with 
the NON-CATI estimate higher. 

We stated earlier that most of the significant differences 
between the responses of teachers in the treatments were 
to attitude and perception items. Table 3 shows the two- 
tailed p-values from the sign-rank tests we performed 
using only the categories of attitude and perception items 
which have negative connotations. 

Each p-value is less than 10 percent, suggesting we 
should reject the null hypothesis. The median difference 
for these items alone is different from zero and the 
distribution of the differences is skewed. This result 
agrees with our observation that NON-CATI treatment 
estimates were higher for these types of items. 

IV. Conclusions 

By randomly assigning teacher sample records between 
telephone modes, we gave each teacher record a chance 
of being assigned to CATI or decentralized telephone for 
follow-up. Our tests show that teachers within a 
treatment provided similar responses to attitude and 
perception questions and that estimates of these responses 
were statistically different between treatments. 

There are two possible explanations. One is that there 
was some periodicity in the way the teacher records were 
ordered. This ordering resulted in the teachers assigned to 
the same treatment having similar characteristics. 
Another is that the assignment was truly random, but we 
were unlucky in the assignment, and the results of the 
assignment are due to the natural variability between 
teachers in the treatments. 

There was also a higher proportion of significant 
differences between the responses of teachers in the third 
comparison group, the group with teachers interviewed 
during telephone follow-up. The majority of the 
significant differences for this group were to attitude and 
perception items, the same as we saw between CATI and 
NON-CATI respondents in the other two analysis groups. 

Also, as seen in the other two groups, CATI treatment 
respondents reported more optimistic answers and the 
NON-CATI respondents reported more pessimistic 
answers. 

Unlike the CATI and NON-CATI respondents in the 
other two groups, in the third group we isolated records 
by both designation mode and interview mode: 

                   Actual Interview Mode 

           CATI Treatment     NON-CATI Treatment 

Group 1    all interviews     all interviews 
Group 2    Mail               Mail 
Group 3    CATI               Decentralized telephone and a 
                              very small number of CATI cases 

The increase in the proportion of significant differences 
between respondents interviewed by telephone may be 
attributable to data in the CATI group being collected by 
CATI and data in the other primarily by decentralized 
telephone interviews. This result suggests that the 
attitude data we collected in CATI and decentralized 
telephone interviews were different. 



LIMITATIONS 

We cannot attribute the differences we observed between 
the responses of teachers in the analysis groups we 
formed for this study solely to the mode in which their 
interviews were completed or to the method we used to 
assign teacher records for mail nonresponse follow-up. 
There are other errors, such as those due to estimation, 
coverage, processing, nonresponse, etc., which may have 
influenced our results. 



TABLE 1 

Questionnaire Items with Significant Differences 

Interview Mode                    Proportion of the Differences between 
                                  CATI and NON-CATI Treatment 
                                  Estimates that were Statistically 
                                  Significant 

All Interviews                    18% 
Mail Returns                      14% 
Telephone Follow-up Interviews    29% 



TABLE 2 

Results of Sign Rank Tests -- All Items 

Interview Mode                    P-value (Pr ≥ |S|) 

All Interviews                    0.4092 
Mail Returns                      0.3782 
Telephone Follow-up Interviews    0.7112 



TABLE 3 

Results of Sign Rank Tests -- Negative Responses 
To Attitude and Perception Items Only 

Interview Mode                    P-value (Pr ≥ |S|) 

All Interviews                    0.0001 
Mail Returns                      0.0012 
Telephone Follow-up Interviews    0.0019 



REINTERVIEW: A TOOL FOR SURVEY QUALITY IMPROVEMENT 

Patricia Feindt, Irwin Schreiner, John Bushery, U.S. Bureau of the Census 

1. Introduction 

This paper discusses how reinterview programs can be a 
key component in efforts to improve survey data quality. 
The process of improving survey data quality is a 
continuous one and is analogous to the Plan-Do-Check- 
Act (PDCA) Cycle. This approach was first developed 
by Walter Shewhart and later referred to in Japan as the 
Deming Cycle (Sytsma). 



[Figure: the Plan-Do-Check-Act (PDCA) cycle] 




This cycle describes the Census Bureau's efforts to 
continuously improve questionnaire design. The "Plan" 
is the cognitive research that goes into the development 
of the questions, the "Do" is the administering of the 
questionnaire in a survey setting, the "Check" is the 
reinterview evaluation, and finally, the "Act" is revising 
those questions with poor reliability. The cycle is then 
repeated in the next round of the survey to evaluate and 
further improve the revised questionnaire. 

The Schools and Staffing Survey (SASS) provides an 
example of this cycle. The SASS is an integrated set of 
surveys including the Administrator, School, and Teacher 
Surveys. The surveys measure critical aspects of teacher 
supply and demand, the composition of the administrator 
and the teacher work force, and the general status of 
teaching and schooling in public and private elementary 
and secondary schools. The National Center for 
Education Statistics (NCES) sponsors the SASS. The 
Census Bureau first conducted the SASS during the 1987- 
88 school year and again during the 1990-91 and 1993-94 
school years. 

During each of these surveys the Census Bureau also 
conducted a reinterview to measure response variance for 
the administrator, school, and teacher surveys. By 
comparing original interview and reinterview responses, 
one can obtain a measure of response variance. In each 
reinterview, the reinterviewers re-asked a subset of 
questions from the original questionnaire. The questions 
selected were critical to the survey or suspected to be 
problematic. The results inform data users of the 
reliability of the questions and identify those that are 
problematic. 

Generally, after problem questions are identified, 
cognitive research and other questionnaire design 
methods are used to make improvements. Then, a 
reinterview study in the next round of the survey can 
assess how much the revised questions improved 
reliability. The NCES and Census Bureau went through 
this process between the 1987-88 and the 1990-91 SASS 
and again between the 1990-91 and 1993-94 SASS. 

While the results shown in this paper are from the SASS, 
this process has also been used in the National Household 
Education Survey (NHES) conducted by Westat for the 
NCES. Brick et al. (1997) used this process on two Head 
Start questions identified as problematic from the 
reinterview in the 1991 survey. The questions were 
revised and reinterviewed in the 1993 NHES, and the 
revised versions produced more consistent responses 
than the versions used in the 1991 survey. 

2.1 Reinterview Methodology 

All the SASS surveys are conducted by mail, with 
telephone follow-up of nonrespondents. In 1994, the 
Census Bureau's Computer Assisted Telephone 
Interviewing (CATI) centers conducted the telephone 
follow-up operations. 

Except for the 1994 School Survey, each of the SASS 
reinterview studies completed about 1,000 reinterviews, 
subsampled from cases completed in the original surveys. 
The 1994 School Survey completed about 550 
reinterviews. Table 2.1a shows reinterview response 
rates by year for each of the three surveys. 
Table 2.1a SASS Reinterview Response Rates (percent) 

                   1988    1991    1994 

Administrator       87      94      82 
School              87      91      62 
Teacher             75      83      73 



The 1988 SASS reinterviews were completed by 
telephone, no matter how the original interview was 
completed -- an imperfect replication of the original 
survey conditions. The 1991 SASS School Survey 
reinterviews exactly replicated the original interview 
modes -- mail and telephone. However, the 1991 
Administrator and Teacher reinterviews were again 
conducted entirely by telephone. In 1994 all three 
reinterview studies replicated the original interview mode 
-- mail and CATI. Table 2.1b illustrates the interview 
and reinterview modes used in the three SASS surveys. 

To determine the effect of question improvements, this 
analysis compares the response variance on questions 
changed from one survey to the next. However, changes 
in reinterview methodology complicated this analysis. 
The more recent reinterview studies better replicate the 
original surveys and produce more accurate estimates of 
response variance. Research by Bushery, Brick, 
Severynse, and McGuinness (1996) indicates that mail 
interviews can yield data with lower response variance 
than CATI interviews. This result suggests that telephone 
reinterviews of mail interviews likely overstate response 
variance. To avoid methodological differences 
confounding year-to-year comparisons, this paper 
compares only similar mode reinterview estimates of 
response variance. The changes in reinterview 
methodology prevent comparisons between the 1988 
SASS and the 1994 SASS. These differences also limit 
the 1991 and 1994 comparisons in the Administrator and 
Teacher Surveys to those cases interviewed and 
reinterviewed by telephone. The shift from paper and 
pencil (PAPI) telephone interviewing in 1991 to CATI in 
1994 also may affect the comparisons between these 
surveys, but the effect on the specific questions compared 
should be minimal. The effect of the change from PAPI 
telephone to CATI appears significant in the School 
Survey, however. Table 2.1c illustrates these 
comparisons. 

This paper reports results for 23 questions or subquestions 
revised after the SASS reinterview. Two were revised 
between 1988 and 1991 and 21 between 1991 and 1994. 

2.2 Analytic Methods Used 



Two statistics assess the reliability of reporting in this 
analysis: the gross difference rate and the index of 
inconsistency. 

The gross difference rate is the percentage of cases with 
different responses in the two interviews and equals twice 
the simple response variance. 



Table 2.1b SASS Interview and Reinterview Mode 

Year/Survey                       Original Mode    Reinterview Mode 

1988 
  Administrator/Teacher           mail             telephone 
                                  telephone        telephone 

1991 
  Administrator/Teacher           mail             telephone 
                                  telephone        telephone 

  School                          mail             mail 
                                  telephone        telephone 

1994 
  Administrator/Teacher/School¹   mail             mail 
                                  CATI             CATI 

¹ The 1994 SASS School Survey reinterview excluded 
private schools interviewed by telephone. 



Table 2.1c Year-to-Year Reinterview Comparisons 

1988 versus 1991 
  Administrator 
    Full sample:       Original-mail/telephone 
                       Reinterview-telephone 
  Teacher 
    Full sample:       Original-mail/telephone 
                       Reinterview-telephone 
  School               No eligible questions 

1991 versus 1994 
  Administrator 
    Partial sample:    Original-telephone 
                       Reinterview-telephone 
  Teacher 
    Partial sample:    Original-telephone 
                       Reinterview-telephone 
  School 
    Full sample:       Original-mail/telephone 
                       Reinterview-mail/telephone 



The index of inconsistency is a relative measure of 
response variability. In some circumstances, it estimates 
the proportion of the total variability due to random 
response error. Foreman and Schreiner (1991) give a 
more detailed discussion of these statistics. Table 2.2 
shows the general format of the possible reporting 
outcomes from the original interview and the reinterview. 
The gross difference rate and index of inconsistency, 
formulated using the cells of this table, can be expressed 
as percentages, 

\[ gdr(\%) = 100\,(b+c)/n \] 

and 

\[ index(\%) = \frac{gdr(\%)}{P_o(1-P_r) + P_r(1-P_o)}, \] 

where $P_o = (a+c)/n$ and $P_r = (a+b)/n$. 

Table 2.2 General Format of Interview-Reinterview Results 

                            Number of cases in Original Interview 

Reinterview                 With              Without           Total 
                            characteristic    characteristic 

With characteristic         a                 b                 a+b 
Without characteristic      c                 d                 c+d 
Total                       a+c               b+d               n = a+b+c+d 
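
The two statistics can be computed directly from the four cells of Table 2.2, as in the sketch below. The function names and the example counts are hypothetical; the formulas are the ones given above, with P_o taken from the original-interview total (a+c)/n and P_r from the reinterview total (a+b)/n.

```python
def gross_difference_rate(a, b, c, d):
    """Percentage of cases whose two responses differ (cells b and c)."""
    n = a + b + c + d
    return 100.0 * (b + c) / n


def index_of_inconsistency(a, b, c, d):
    """GDR divided by the disagreement expected from the two marginals."""
    n = a + b + c + d
    p_o = (a + c) / n  # proportion with the characteristic, original interview
    p_r = (a + b) / n  # proportion with the characteristic, reinterview
    expected = p_o * (1 - p_r) + p_r * (1 - p_o)
    return gross_difference_rate(a, b, c, d) / expected


# Hypothetical interview-reinterview table with n = 100 cases.
gdr_example = gross_difference_rate(40, 5, 5, 50)
index_example = index_of_inconsistency(40, 5, 5, 50)
```

With these made-up counts, 10 percent of cases change their answer, while the index is about 20 because the two-category split would produce roughly 50 percent disagreement if responses were unrelated.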



The gross difference rate and the index of inconsistency 
apply to dichotomous questions. Each response category 
of "mark all that apply" questions is treated as a separate 
dichotomous variable. Finally, the aggregate gross 
difference rate and the aggregate index of inconsistency 
measure response variance in polytomous questions. The 
aggregate gross difference rate is the percentage of all 
cases reporting different responses in the two interviews. 
The aggregate index may be regarded as a weighted 
average of indexes across all categories of a question. 
U.S. Bureau of the Census (1985) describes these 
statistics in more detail. 
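
For a question with more than two categories, the aggregate gross difference rate described above can be sketched as the share of cases that fall off the diagonal of the interview-reinterview cross-tabulation. The function name and the 3-by-3 table are hypothetical; the aggregate index is not shown because its weighting formula is not given in this paper.

```python
def aggregate_gross_difference_rate(table):
    """Percentage of cases with different responses in the two interviews.

    table[i][j] is the number of cases answering category i in the
    reinterview and category j in the original interview.
    """
    n = sum(sum(row) for row in table)
    agree = sum(table[k][k] for k in range(len(table)))
    return 100.0 * (n - agree) / n


# Hypothetical three-category question with n = 100 cases.
example_table = [
    [30, 2, 1],
    [3, 40, 2],
    [1, 1, 20],
]
aggregate_gdr_example = aggregate_gross_difference_rate(example_table)
```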

All observations with missing responses to either the 
original or the reinterview were excluded from the 
analysis. Items with too few observations to estimate the 
index of inconsistency reliably also were excluded. The 
individual estimates of the index and the gross difference 
rate were compared using the Z-test. All comparisons 
were tested for significance at the 0.10 level. 



3.1 A Comparison of the 1988 and 1991 Teacher 
and Administrator Surveys 

We compare two problematic 1988 teacher subquestions, 
"Bachelor’s" and "Master’s" from the question, "Which 
of the following college degrees have you earned?" This 
question was revised and reinterviewed in 1991. The 
revised 1991 question showed significant 
improvement in response variance. The 1988 question 
provided a list of possible degrees and asked the 
respondent to "mark all that apply." In 1991 two 
"yes/no" questions, "Do you have a bachelor’s degree?" 
and "Do you have a master’s degree?" were asked, with 
a "mark all that apply" question for the remaining degrees 
(associate, doctorate, etc.). The response variance was 
substantially reduced for the two "yes/no" 1991 questions. 
However, the items that remained "mark all that apply" 
showed no improvement. 

Administrators also were asked about "degrees earned." 
The revised 1991 "yes/no" questions showed 
improvement similar to that in the 1991 Teacher Survey. 
Table 3.1 
shows the index of inconsistency and the gross difference 
rate (GDR) for these questions, for both the teacher and 
administrator surveys. A Z-test at significance level 0.10 
shows the revised 1991 question has lower response 
variance than the 1988 question. Bushery et al. (1992) 
discuss these questions in more detail. 

3.2 A Comparison of the 1991 and 1994 
Administrator and Teacher Surveys 

Five of the seven Administrator and Teacher questions 
revised and reinterviewed in 1994 improved significantly 
from 1991. The problematic "mark all that apply" 
question identified in the 1991 Administrator Survey, 
"What other school positions, if any, did you hold before 
you became a principal?" provided a list of six positions 
for administrators to choose from. Table 3.2 shows this 
list of positions. 

The field interviewers were instructed to read the six 
positions, pausing after each, and mark all that applied. 
If the response was "yes" to "Other - Specify," then they 
were to fill in the response. If the respondent answered 
"no" to all six positions, then the "None" box was to be 
marked. 

The analogous 1994 question had two parts. First CATI 
interviewers asked "Did you hold any other school 
positions BEFORE you became a principal?" If the 
respondent answered "yes," the next question the CATI 
interviewers asked was, "Did you hold the position of 
____?" A list of positions was provided with "yes/no" 
boxes. The CATI interviewer had to mark the "yes" box 
or "no" box. 

The 1994 revised questions divided the position, 
"Department Head and Curriculum Coordinator" into 
two separate positions. In addition, a new position, 
"Library Media Specialist" was added. 

Four of the six 1991 subquestions (i.e., positions) 
showed significant improvement in their indexes in 1994 
when the "yes/no" format was used. Five showed 
significant improvement in their GDRs. We hypothesize 
this improvement is due to the fact that changing the 
"mark all that apply" question to a series of "yes/no" 
questions forced the CATI interviewers to ask each 
position individually. In 1991 the interview was 
performed by paper and pencil (PAPI) and did not force 
the interviewers to ask each position individually. Table 
3.2 provides response variance results for each 
subquestion. 

The response variance for the 1994 teacher question, 
"What Type of Certificate do you hold in this field?" 
showed no improvement. This question was revised in 
1994 by adding four more categories. The aggregate 
index was 39.5 in 1991 and 55.9 in 1994 (z = 1.4). 

3.3 A Comparison of the 1991 and 1994 School 
Surveys 

We compare the 1991 School Survey question, "For what 
grade levels does this school offer instruction?" to the 
1994 question, "How many students were enrolled in 
each of the grades shown on the front page, plus any 
ungraded levels, around the first of October?" Both 
questions provided a list of grades ranging from 
"ungraded" to "12th grade." We analyzed 14 
subquestions. 

The instructions were revised slightly in 1994 by asking 
respondents not to include prekindergarten, 
postsecondary or adult education students. The 
instruction also asked the respondents to refer to their 
'official fall report.' 

In the CATI portion of the sample, ten of the 14 
subquestions showed significant improvement in their 
indexes. 
Thirteen showed significant improvement in their GDRs. 
See table 3.3 for the results from the CATI portion of the 
sample. 

In the mail portion of the sample, one of the 14 
subquestions, "12th Grade," significantly worsened in 
1994, from an index of 2.3 (1.1, 4.8) in 1991 to 7.3 
(4.0, 13.5) in 1994 (Z = -1.9), and the rest showed no 
significant difference. See table 3.4 for results from the 
mail portion of the sample. Further, the responses to the 
"grade level offered" and "how many students enrolled" 
subquestions were not completely consistent. Often 
respondents failed to mark "grade level offered," but then 
reported a number of students enrolled. Sebron (1997) 
examined three grade levels: ungraded, fourth, and tenth. 
He found that between 37 and 49 percent of the mail 
respondents reported students enrolled in these grades, 
but failed to indicate that those grades were offered. 
Fortunately, the "grade level offered" information for 
these cases is resolved during a consistency edit. 
The CATI part of the sample did not experience this 
inconsistency. The CATI instrument forced interviewers 
to answer the "grade level" subquestion before entering 
the number of students enrolled. 

4. Conclusions 

A series of "yes/no" questions almost always generates 
more reliable responses than a single "mark all that 
apply" question. The payoff in reliability provides some 
assurance that respondent burden is worth the extra effort. 
This result supports other work that suggests data quality 
is better when a series of "yes/no" questions is asked. 
Rasinski found that item nonresponse is lowered when 
individual "yes/no" questions are asked, rather than a 
"mark all that apply" question (Rasinski, Mingay, 
Bradburn, 1994). 

The PDCA cycle has achieved some success in improving 
data quality in the SASS. This success has been limited 
because few questions have been reinterviewed a second 
time after being assessed by cognitive methods. The Census 
Bureau and NCES should develop a more comprehensive 
plan for continual questionnaire improvement. 



This paper reports the general results of research 
undertaken by Census Bureau staff. The views expressed 
are attributable to the authors and do not necessarily reflect 
those of the Census Bureau. 
Table 3.1 Summary of the Reinterview Reliability for the 1988 and 1991 Administrator and Teacher Survey 

Which of the following       Index of Inconsistency     Gross Difference Rate (%) 
degrees have you earned?     1988    1991   Z(diff)     1988    1991   Z(diff) 

Teacher 
  Bachelors Degree           79.5    -      NA           7.5     0.6    6.8* 
  Masters Degree              8.9    2.2    3.8*         4.3     1.1    3.7* 

Administrator 
  Bachelors Degree           98.5    -      NA          20.3     1.3   14.5* 
  Masters Degree             49.4   11.3    6.7*         9.9     1.7    7.4* 

- too few cases to reliably estimate the index. 
* significant at 0.10 alpha 

Table 3.2 Summary of the Reinterview Reliability of the 1991 and 1994 Administrator Surveys 

What other school positions            Index of Inconsistency     Gross Difference Rate (%) 
did you hold before becoming           1991    1994   Z(diff)     1991    1994   Z(diff) 
a principal? 

Dept Head or curriculum coordinator    61.1    26.5   3.4*        23.5    13.2   1.8* 
Assist Principal or program director   29.4    23.4   0.7         14.7     8.6   1.7* 
Guidance Counselor                     36.1    23.1   1.0          7.6     5.3   0.8 
Athletic Coach                         45.0    14.4   3.3*        16.5     6.6   2.8* 
Sponsor for Student clubs              83.1    31.7   4.5*        31.2    15.9   3.3* 
Other - Specify                        94.6    57.2   3.7*        58.5    24.5   6.6* 

* significant at 0.10 alpha 
Table 3.3 Summary of the Reliability of the 1991 and 1994 Schools Survey of the Telephone/CATI Sample 

For what grade levels does this     Index of Inconsistency     Gross Difference Rate (%) 
school offer instruction?           1991    1994   Z(diff)     1991    1994   Z(diff) 

Ungraded                            79.1    41.5   2.1*        11.1     8.6   0.9 
Kindergarten                        16.3     2.3   3.1*         8.5     1.1   3.5* 
1st grade                           17.0     3.5   2.9*         8.5     1.7   3.1* 
2nd grade                           16.3     3.5   2.8*         8.1     1.7   3.0* 
3rd grade                           17.8     3.5   3.0*         8.9     1.7   3.3* 
4th grade                           16.3     4.6   2.5*         8.1     2.3   2.7* 
5th grade                           16.3     4.6   2.5*         8.1     2.3   2.7* 
6th grade                           16.4     4.9   2.5*         8.1     2.3   2.7* 
7th grade                           16.0     9.4   1.3          7.8     4.0   1.7* 
8th grade                           16.7     9.3   1.4          8.1     4.0   1.8* 
9th grade                           11.1     3.7   1.9*         5.6     1.7   2.2* 
10th grade                           9.0     4.9   1.1          4.4     2.3   1.2* 
11th grade                           9.0     2.4   1.9*         4.4     1.1   2.0* 
12th grade                           7.5     2.4   1.6          3.7     1.1   1.7* 

* significant at 0.10 alpha 



Table 3.4 Summary of the Reliability of the 1991 and 1994 Schools Survey of the Mail/Mail Sample 

For what grade levels does this     Index of Inconsistency     Gross Difference Rate (%) 
school offer instruction?           1991    1994   Z(diff)     1991    1994   Z(diff) 

Ungraded                            49.9    34.9   0.8          6.5     7.3   0.4 
Kindergarten                         5.7     2.8   1.3          2.8     1.4   1.3 
1st grade                            5.7     5.5   0.1          2.8     2.8   0.0 
2nd grade                            4.8     2.8   0.9          2.4     1.4   1.0 
3rd grade                            5.7     4.6   0.4          2.8     2.3   0.4 
4th grade                            6.1     7.3   0.4          3.0     3.7   0.5 
5th grade                            5.2     5.5   0.1          2.6     2.8   0.2 
6th grade                            4.8     6.6   0.6          2.4   3.0.2   0.6 
7th grade                            3.6     6.5   1.0          1.7     3.2   1.1 
8th grade                            4.0     5.5   0.6          2.0     2.8   0.6 
9th grade                            4.1     7.0   1.0          2.0     3.2   0.9 
10th grade                           4.3     6.1   0.6          2.8     0.6   0.6 
11th grade                           2.8     6.2   1.1          1.3     2.8   1.1 
12th grade                           2.3     7.3   1.9*         1.1     3.2   1.1 

* significant at 0.10 alpha 
I. Purpose of this Presentation 

This paper discusses the "traditional" universe of private 
elementary and secondary schools developed by the 
Census Bureau for the National Center for Education 
Statistics (NCES). This universe was initially developed 
in 1987 and subsequently updated five (5) times, with the 
sixth update currently in progress. Results of earlier 
updates have been previously reported, so this 
presentation focuses on the most recent updates, in 1995. 
Key results of the updates, including an analysis of the 
sources of added schools, their characteristics, and the 
impact of the adds on the universe, are discussed. 
Additionally, development of the Kindergarten-Terminal 
(K-Terminal) frame and results of the capture/recapture 
analysis are discussed. 

II. Background 

As background, it is useful to provide definitions that 
pertain to this paper. 

A. K-Terminal 

A K-Terminal school contains an educational program 
primarily for 5-year-old children who will enter first 
grade in the upcoming school year. This includes 
transitional kindergartens and/or first grades if these 
children are expected to enter first grade upon completing 
these programs. Some of these K-Terminal programs 
may contain nursery or preschool age children. The 1995 
PSS estimated approximately 7,300 private K-Terminal 
schools in the nation. 

B. Private School Universe 

It is useful to review the definition of the private school 
universe and how it is used. The private school universe 
is defined as including all schools that provide 
educational services for at least one of grades 1-12, have 
one or more teachers, are not administered by a public 
agency, and are not operated in a private home. 
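
The four conditions in this definition can be read as a single eligibility test. The sketch below is one way to express it; the field names and the grade coding (0 for kindergarten, negative values for prekindergarten) are assumptions made for the example, not part of the PSS processing system.

```python
from dataclasses import dataclass


@dataclass
class School:
    lowest_grade: int   # hypothetical coding: 0 = kindergarten, -1 = prekindergarten
    highest_grade: int  # 1 through 12 for graded programs
    teachers: int
    publicly_administered: bool
    in_private_home: bool


def in_private_school_universe(s):
    """Apply the four conditions of the private school universe definition."""
    # At least one of grades 1-12 falls inside the school's grade span.
    offers_grade_1_to_12 = s.highest_grade >= 1 and s.lowest_grade <= 12
    return (
        offers_grade_1_to_12
        and s.teachers >= 1
        and not s.publicly_administered
        and not s.in_private_home
    )
```

Under this reading, a K-Terminal program (highest grade kindergarten) fails the first condition, which is why it is kept on a separate frame rather than on the traditional universe.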

The private school universe is used in two major data 
collection efforts: 



1) First, all of the schools on this universe are 
included in the Private School Survey (PSS). 
PSS is a census of private elementary and 
secondary schools conducted biennially for 
NCES beginning with the 1989-90 school year. 
PSS has a two-fold purpose. 

a) First, it generates biennial data on the 
total number of private schools, along 
with the number of students, teachers, 
and graduates at these schools. 

b) Second, the results it generates are 
used to build an accurate and complete 
list of private schools for NCES to use 
for other private school surveys. 

The 1995 PSS estimated that there are 27,686 
private elementary-secondary schools in the 
nation. 

2) The second major data collection effort using 
this universe is the Schools and Staffing Survey 
or SASS. SASS selects a sample of 
approximately 3,500 private schools from the 
private school universe. 

It is also useful to discuss the methodology for compiling 
and updating the K-Terminal universe and the traditional 
private school universe. 

C. Traditional Private School Universe 

The traditional private school universe is built through two 
coverage improvement operations - List Frame updating 
and Area Search Frame updating. List Frame updating is a 
national coverage improvement operation designed to 
locate private elementary and secondary schools not 
already on the existing private school universe. The 
updating operation uses lists from private school 
associations, the 50 states and Washington, D.C., and 
private vendors. Area Search Frame updating is a 
coverage improvement operation consisting of an 
independent search in a nationally representative sample 
of counties. This operation is used to locate private 
schools still missing from the private school universe 
after list frame updating. 

As mentioned earlier, the private school universe was 
initially developed in 1987 with Quality Education Data 
Incorporated (QED) providing us with a list of private 
elementary and secondary schools. List Frame updating 
was the first step in improving the coverage of this 
universe. For this update, 22 of the largest private school 
associations in the country were contacted and their lists 
of schools were requested. These lists were matched to 
the QED list and eligible non-matched schools were 
added to the universe. 
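
The matching step can be sketched as follows. The paper does not describe the actual matching rules, so the example assumes a deliberately simple one: two records match when their normalized school name and five-digit ZIP code agree. All function and field names are hypothetical.

```python
def match_key(name, zip_code):
    """Build a crude match key from a school name and ZIP code (assumed rule)."""
    cleaned = name.lower().replace(".", "").replace(",", "")
    normalized = " ".join(cleaned.split())  # collapse repeated whitespace
    return (normalized, zip_code.strip()[:5])


def update_list_frame(universe, new_list, is_eligible):
    """Return (updated universe, adds) after matching new_list to universe."""
    known = {match_key(s["name"], s["zip"]) for s in universe}
    adds = []
    for school in new_list:
        key = match_key(school["name"], school["zip"])
        # Only eligible schools that match nothing already on file are added.
        if key not in known and is_eligible(school):
            adds.append(school)
            known.add(key)  # also guards against duplicates within new_list
    return universe + adds, adds
```

A production match would need to tolerate name variants, address changes, and clerical review of near matches; this sketch shows only the flow of matching a list and adding the eligible non-matches.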

The next step in improving the coverage of the first 
Private School Universe was area frame updating. For 
this update, a national sample of 75 PSUs was selected 
and field representatives were instructed to use up to ten 
(10) different sources such as the Yellow Pages to create 
an independent list of all private elementary and 
secondary schools in these sample areas. These 
independent lists were matched to the universe resulting 
from the list frame updating within each of the sample 
PSUs. The in-scope schools that did not match were 
weighted up to represent the schools that were missing 
from the updated list frame. 
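
As a rough illustration of "weighting up," the sketch below assumes each sample PSU carries a weight equal to the inverse of its selection probability, so that the unmatched in-scope schools found in the sample estimate the number missing nationally. The actual area frame weights are not given in this paper, and the PSU labels and probabilities in the usage note are invented.

```python
def weighted_missing_schools(unmatched_by_psu, selection_prob):
    """Estimate schools missing from the list frame from a PSU sample.

    unmatched_by_psu: PSU id -> count of in-scope schools not on the list frame
    selection_prob:   PSU id -> probability the PSU was selected (assumed known)
    """
    total = 0.0
    for psu, n_unmatched in unmatched_by_psu.items():
        prob = selection_prob[psu]
        if not 0 < prob <= 1:
            raise ValueError(f"invalid selection probability for PSU {psu}")
        # Each unmatched school represents 1/prob schools nationally.
        total += n_unmatched / prob
    return total
```

For instance, 3 unmatched schools in a PSU selected with probability 0.5 and 2 in a PSU selected with probability 0.1 would represent an estimated 26 missing schools.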

Since the initial development and updating of the private 
school universe in 1987, the universe has been updated 
every two years. In 1989, the List Frame updating was 
done using only 12 association lists due to budget 
constraints. We picked the lists based on the following 
criteria: 

a) the list was not too large 

b) the list showed a significant difference in the total number 
of schools reported between 1987 and 1989. 

The Area Frame Updating in 1989 was done in a sample 
of 120 PSUs. 

For updates in 1991, 1993, 1995, and 1997 we used many 
more lists for the list frame updating. These updates 
included lists from as many as 44 private school 
associations, the 50 states and Washington, D.C., QED 
and Josten's Education Data. For the area frame, we 
continued to use sets of 120 PSUs as we did in 1989. 

D. K-Terminal Private School Universe 

In 1993-94, we began to collect information on 
K-Terminal school programs and build a K-Terminal frame. 
As lists were collected from the 50 states and 
Washington, D.C., and Associations for the 1993 list 
frame updating and from the sample PSUs for the 1993 
area frame updating, we began to identify and separate 
those programs that indicated that they contained at most 
a kindergarten or were primarily for 5-year-old children. 



In 1995, the K-Terminal updating again consisted of a list 
frame updating and an area frame updating operation. In 
addition to what was done in 1993, more of an effort was 
made to contact states or other alternative private 
organizations to specifically ask for a list of their private 
kindergartens. This was added to the 1995 operation to 
evaluate alternative sources for lists of kindergartens and 
to improve coverage of schools containing kindergartens. 

The results of the 1995 updating operations will now be 
presented. This analysis is done separately for the list 
frame updating, the area frame updating, and the 
K-Terminal operations. The 1995 results from both the list 
frame and area frame are contrasted with the 1993 results. 
Note that unless otherwise stated the results for 1995 are 
similar to those for 1993. 

III. List Frame Updating Analysis 

In 1995, we added about 2,400 in-scope schools to the 
universe during the "traditional" list frame updating. The 
corresponding 1993 number was 2,300. 

In terms of the sources of the adds, 62% came from the 
state lists and 38% came from the association lists. 

Overall, the state lists were most effective with a total of 
about 1,500 adds. As might be expected, the list from the 
state of California provided the largest number of adds 
(about 25% of the total of the state list adds). The next 
nine (9) states provided another 43% of the adds, such 
that the top 10 states accounted for about 2/3 of the state 
adds in 1995. 
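
The percentages in this section follow from the rounded counts given in the text. The short check below recomputes them; because the counts are approximate ("about 1,500" and "about 900"), the computed shares agree with the reported 62% and 38% only up to rounding.

```python
def shares(counts):
    """Return each source's share of the total, in percent."""
    total = sum(counts.values())
    return {source: 100.0 * n / total for source, n in counts.items()}


# Approximate counts of in-scope adds quoted in the text.
adds_by_source = {"state lists": 1500, "association lists": 900}
source_shares = shares(adds_by_source)

# California (about 25%) plus the next nine states (43%) of the state list adds.
top_ten_share = 25 + 43
```

Here `source_shares` comes out at 62.5 and 37.5 percent, and `top_ten_share` is 68 percent, consistent with "about 2/3" of the state adds.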

The list from Arkansas was the most effective list since 
about 16% of the schools on the list were in-scope adds. 
The next three most effective state lists had effectiveness 
rates above 12%. They were Tennessee, Montana, and 
Georgia. 

The four least effective state lists were from Kansas, 
Iowa, North Dakota, and Oklahoma with no adds. 
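
The effectiveness rate used above is the share of schools on a list that turned out to be in-scope adds. The sketch below computes it and ranks lists by it. The counts in the usage note are invented for illustration, since the paper reports only the resulting rates.

```python
def effectiveness_rate(in_scope_adds, schools_on_list):
    """Percent of the schools on a list that were in-scope adds."""
    if schools_on_list <= 0:
        raise ValueError("list must contain at least one school")
    return 100.0 * in_scope_adds / schools_on_list


def rank_lists(lists):
    """Sort (list name, adds, schools on list) tuples, most effective first."""
    rated = [(name, effectiveness_rate(adds, size)) for name, adds, size in lists]
    return sorted(rated, key=lambda pair: pair[1], reverse=True)
```

With hypothetical inputs such as `rank_lists([("List A", 16, 100), ("List B", 0, 250)])`, List A ranks first at 16 percent and List B last at 0 percent. A large list can supply many adds and still have a low effectiveness rate, so the two measures answer different questions.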

Association Lists were also effective, adding about 900 
schools. We are not able to do any further detailed 
analysis of adds from association lists because the 
information was lost. 

The characteristics of the adds will now be presented. 
Regarding the religious orientation of the added schools: 

• 57% of schools were Other Religious schools 

• 39% of schools were Nonsectarian schools 

• 4% of schools were Catholic schools 

In terms of the grade level of the added schools: 

• 49% of schools were Elementary schools 

• 40% of schools were Combined schools 

• 11% of schools were Secondary schools 

In terms of the size of the added schools, we see that 
schools added from the list frame were predominantly 
small schools, which contributed 69%. The next largest schools 
contributed 18%, and the larger schools contributed at 
most 7%. 

In terms of the percent of minority students at the added 
schools, we see that schools with the lowest minority 
percentage contributed the most to the list frame with 
35%. Schools with the next highest minority percentage 
contributed 23%, and schools with the higher minority 
percentage contributed 32%. 

In terms of the school type of the added schools, more 
than half (58%) are regular elementary/secondary 
schools. Each of the other school types contributes at most 
18%. 

Looking at the impact on the universe estimates, we find 
that, overall, list frame adds represented: 

• 8% of schools on the universe 

• 3% of students on the universe 

• 4% of teachers on the universe 

• 1% of graduates on the universe 

These percentages were close to what they were in 1993 
with the exception of graduates where the impact on the 
universe was 3% in 1993. 
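
The "impact" figures can be read as the adds' share of a universe total, overall or within a group. The paper does not spell out the formula, so the sketch below assumes the denominator is the updated universe total including the adds. The record fields are hypothetical.

```python
from collections import defaultdict


def impact_by_group(schools, group_key, measure):
    """Percent of each group's total for `measure` that comes from added schools.

    schools: list of dicts with `group_key`, `measure`, and a boolean "is_add".
    Assumption: the denominator is the updated universe, adds included.
    """
    added = defaultdict(float)
    total = defaultdict(float)
    for school in schools:
        group = school[group_key]
        total[group] += school[measure]
        if school["is_add"]:
            added[group] += school[measure]
    return {g: 100.0 * added[g] / total[g] for g in total if total[g] > 0}
```

Using a constant field of 1 per record as the measure gives the impact on the number of schools, and using enrollment gives the impact on students. Because added schools tend to be small, the student impact can be well below the school impact, as in the figures above.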

The impact varied considerably for the religious 
orientations and showed that the list frame updating had 
a substantial impact on improving coverage of 
Nonsectarian and Other Religious schools and very little 
impact on Catholic schools. 

• Nonsectarian schools led the way with 15% 
impact 

• Other Religious schools followed with 10% 
impact 



• Catholic schools had a minimal 1% impact 

The impact for the school grade levels showed less 
variation and indicated that the list frame updating had an 
impact on improving the coverage for all 3 grade levels. 

• Combined schools led the way with a 12% 
impact 

• Secondary schools followed with a 9% impact 

• Elementary schools were next with a 7% impact 

The impact varied considerably for the different sized 
schools. An inverse relationship exists between the size 
of school and the size of this impact. The smallest 
schools had a 19% impact and the largest schools had a 
1% impact. 

The impact varied only slightly for schools with different 
percentages of minority students. 

In terms of the impact of schools of different types, we 
see that Voc. Tech., Montessori, and Alternative schools 
had at least a 24% impact whereas the other 4 had at most 
a 16% impact. This is somewhat different from what it 
was in 1993. 

IV. Area Frame Updating Analysis 

In 1995, we identified a weighted estimate of 2,386 
in-scope area frame schools during the updating. The 
corresponding 1993 number was 2,026. 

The characteristics of the adds will now be presented. 
Regarding the religious orientation of the added schools: 

• 62% of schools were Other Religious schools 

• 35% of schools were Nonsectarian schools 

• 3% of schools were Catholic schools 

In terms of the grade level of the added schools: 

• 47% of schools were Elementary schools 

• 49% of schools were Combined schools 

• 4% of schools were Secondary schools 

In terms of the size of the added schools, we see that 
schools added from the area frame were predominantly 
small schools, which contributed 77%. The next largest schools 
contributed 12%, and the larger schools contributed at 
most 5%. 

In terms of the percent of minority students at the added 
schools, we see that schools with the lowest minority 
percentage contributed the most to the area frame with 
37%. Schools with the next highest minority percentage 
contributed 29%, and schools with the higher minority 
percentage contributed 20%. 

In terms of the school type of the added schools, we see 
that 8 out of 10 schools are either regular 
elementary/secondary or Alternative schools. Each of 
the other types contributes at most 7%. 

The characteristics of the area frame adds were somewhat 
similar to those of the list frame adds for Religious 
Orientation, Grade Level, Enrollment, Percentage of 
Minority Students, and Type of School. 

Looking at the impact of these adds on the universe 
estimates, we find that, overall, area frame adds 
represented: 

• 8% of schools on the universe 

• 3% of students on the universe 

• 4% of teachers on the universe 

• 1% of graduates on the universe 

The impact varied considerably for the religious 
orientations and showed that area frame updating had a 
substantial impact on improving coverage of 
Nonsectarian and Other Religious schools and very little 
impact on Catholic schools. 

• Nonsectarian schools led the way with 13% 
impact 

• Other Religious schools followed with 11% 
impact 

• Catholic schools had a minimal 1% impact 

The impact for the school grade levels showed less 
variation and indicated that area frame updating had an 
impact on improving the coverage for all 3 grade levels. 

• Combined schools led the way with a 14% 
impact 

• Secondary schools followed with a 7% impact 



• Elementary schools were next with a 4% impact 

The impact varied considerably for the different sized 
schools. An inverse relationship exists between the size 
of school and the size of this impact. 

The impact of schools with different percentages of minority 
students varied only slightly. 

In terms of the impact of schools of different types, we 
see that Alternative and ECC/Daycare schools have a 
combined 60% impact whereas the others have at most a 
16% impact each. This is somewhat different from what 
it was in 1993. 

The characteristics of the area frame adds were somewhat 
similar to those of the list frame adds for Religious 
Orientation, Enrollment, and Percentage of Minority 
Students. 

V. K-Terminal Updating Analysis 

Regarding the religious orientation of the added schools: 

• 74% of schools were Other Religious schools 

• 25% of schools were Nonsectarian schools 

• 1% of schools were Catholic schools 

In terms of the grade level of the added schools: 

• 30% were Kindergarten only 

• 70% were Kindergarten and less 

Looking at the impact of these K-Terminal adds on the 
universe estimates, we find that, overall, these adds 
represented: 

• 41% of schools on the K-Terminal universe 

• 34% of students on the K-Terminal universe 

• 32% of teachers on the K-Terminal universe 

The impact varied somewhat for the religious 
orientations and showed that the K-Terminal updating 
had a substantial impact on improving coverage of all 3 
religious orientations. 

• Nonsectarian schools led the way with 43% 
impact 



